WHAT IS CLAIMED IS:

1. A document processor which displays and prints in a
predetermined format a plurality of document data input thereto,
comprising:

document memory which stores input document data;

selection unit which selects all or part of document data
stored in said document memory;

characteristics extraction unit which extracts data
relating to characteristics of letter rows from all or part of
the document data selected by said selection unit;

work processing unit which work-processes all or part of
the document data based on the data relating to characteristics
of letter rows extracted by said characteristics extraction
unit; and

output unit which outputs all or part of the document data
work-processed by said work processing unit.
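
As an informal illustration only, the units recited in claim 1 can be pictured as a small pipeline. The following is a minimal sketch under stated assumptions: the class name DocumentProcessor, the keyword-based selection, the character and word counts used as "characteristics of letter rows", and the truncation used as "work-processing" are all hypothetical choices that do not come from the claims.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentProcessor:
    """Hypothetical stand-in for the units recited in claim 1."""
    memory: list = field(default_factory=list)  # document memory

    def store(self, text):
        # Store input document data.
        self.memory.append(text)

    def select(self, keyword):
        # Selection unit: keep stored documents containing a keyword (assumed criterion).
        return [doc for doc in self.memory if keyword in doc]

    @staticmethod
    def extract(docs):
        # Characteristics extraction unit: character and word counts (assumed features).
        return [{"chars": len(d), "words": len(d.split())} for d in docs]

    @staticmethod
    def work_process(docs, features, max_words=5):
        # Work processing unit: truncate documents longer than max_words (assumed rule).
        out = []
        for doc, feat in zip(docs, features):
            if feat["words"] > max_words:
                doc = " ".join(doc.split()[:max_words]) + " ..."
            out.append(doc)
        return out

    def output(self, keyword):
        # Output unit: return the work-processed selection.
        docs = self.select(keyword)
        return self.work_process(docs, self.extract(docs))
```

Each method corresponds to one claimed unit, so an alternative selection rule or processing rule could be substituted without changing the overall flow.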



2. The document processor according to claim 1, wherein said
output unit comprises item value set unit which sets a plurality
of item values based on the contents of all or part of the
document data work-processed by said work processing unit; and
totalization unit which totalizes all or part of the document
data for each item value set by said item value set unit;

said output unit outputs all or part of the document data
in the format of a table having an item value as at least one
axis.
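
The totalization recited in claim 2 can be sketched informally as counting records per item value and printing the counts as a table. This is only an illustration: the function names totalize and as_table, the dictionary record format, and the tab-separated layout are assumptions that are not specified in the claims.

```python
from collections import Counter


def totalize(records, item_key):
    """Count records per item value (item value set unit plus totalization unit)."""
    return Counter(r[item_key] for r in records)


def as_table(totals):
    """Render totals as a plain-text table with item values on the row axis."""
    lines = ["item\tcount"]
    for item, count in sorted(totals.items()):
        lines.append(f"{item}\t{count}")
    return "\n".join(lines)
```

A two-axis table would follow the same pattern with a pair of item values as the counting key.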



3. The document processor according to claim 1, wherein said
output unit outputs all or part of the document data
work-processed by said work processing unit together with all or part
of the document data in its state prior to work-processing by
said work processing unit.



4. The document processor according to claim 1, wherein said
document memory further stores all or part of the document data
work-processed by said work processing unit.

5. The document processor according to claim 1, wherein said
selection unit further selects all or part of the document data
output by said output unit.

6. The document processor according to claim 1, wherein said
document memory further stores data relating to contents of the
work processing.

7. A document classification device which classifies
documents based on contents thereof comprising:

input unit which inputs document data;

language analyzer unit which analyzes document data input
by said input unit and obtains language analysis information;

vector creation unit which obtains document
characteristic vectors for the document data based on the
language analysis information obtained by said language
analyzer unit;

classification unit which classifies documents based on
the degree of similarity between document characteristic
vectors created by said vector creation unit, and creates
clusters of documents;

cluster characteristics calculation unit which
calculates cluster characteristics, which are characteristics
of clusters of documents created by said classification unit;
and

classification category memory which stores cluster
characteristics, calculated by said cluster characteristics
calculation unit, as constituent elements of classification
categories.
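
The flow recited in claim 7 (vector creation, classification by similarity, calculation of cluster characteristics) can be sketched informally as follows. The claims do not name a tokenizer, a similarity measure, or a clustering algorithm; the whitespace tokenization, cosine similarity, greedy single-pass clustering, and most-frequent-term characteristics below are assumptions chosen only to keep the example short.

```python
import math
from collections import Counter


def make_vector(text):
    # Language analysis reduced to lowercase whitespace tokenization (an assumption).
    return Counter(text.lower().split())


def cosine(a, b):
    # Degree of similarity between two term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster(vectors, threshold=0.3):
    # Greedy single pass: join the first cluster whose first member is similar enough.
    clusters = []  # each cluster is a list of document indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def cluster_characteristics(vectors, members, top_n=2):
    # Cluster characteristics: the most frequent terms, ties broken alphabetically.
    total = Counter()
    for i in members:
        total.update(vectors[i])
    ranked = sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]
```

Storing the returned term lists under a category name would play the role of the classification category memory.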

8. A document classification device which classifies
documents based on contents thereof comprising:

input unit which inputs document data;

language analyzer unit which analyzes document data input
by said input unit and obtains language analysis information;

vector creation unit which creates document
characteristic vectors for the document data based on the
language analysis information obtained by said language
analyzer unit;

classification unit which classifies documents based on
the degree of similarity between document characteristic
vectors created by said vector creation unit, and creates
clusters of documents;

cluster characteristics calculation unit which
calculates cluster characteristics, which are characteristics
of clusters of documents created by said classification unit;

display unit which displays the cluster characteristics
calculated by said cluster characteristics calculation unit;

cluster selection specification unit which selects
predetermined clusters from clusters of documents created by
said classification unit; and

classification category memory which stores cluster
characteristics, calculated by said cluster characteristics
calculation unit, as constituent elements of classification
categories.



9. The document classification device according to claim 8,
further comprising document characteristic vector memory which
stores document characteristic vectors created by vector
creation unit; and

vector correction unit which corrects document
characteristic vectors stored in said document characteristic
vector memory, so that document characteristic vectors of
documents belonging to clusters selected by said cluster
selection unit are deleted;

said classification unit classifies documents
based on the document characteristic vectors corrected by said
vector correction unit.
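
The vector correction recited in claim 9 amounts to removing the vectors of documents in the selected clusters before classifying again. A minimal sketch, assuming vectors are held in a dictionary keyed by document id and clusters in a dictionary keyed by cluster id (both hypothetical data layouts, as is the function name remove_selected):

```python
def remove_selected(vectors, clusters, selected):
    """Drop the vectors of documents that belong to the selected clusters.

    vectors:  dict mapping document id to its characteristic vector
    clusters: dict mapping cluster id to a list of document ids
    selected: iterable of cluster ids chosen by the operator
    """
    removed = {doc for cid in selected for doc in clusters[cid]}
    return {doc: vec for doc, vec in vectors.items() if doc not in removed}
```

The remaining vectors can then be passed to whatever classification routine produced the original clusters.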

10. The document classification device according to claim 8,
further comprising document characteristic vector memory which
stores document characteristic vectors created by vector
creation unit; and

document expression space correction unit which corrects
document expression space when determining the degree of
similarity between document characteristic vectors stored in
said document characteristic vector memory, based on a
characteristics amount calculated from clusters selected by
said cluster selection unit;

said classification unit classifies the documents based on
the degree of similarity between document characteristic
vectors created by said vector creation unit, using the document
expression space corrected by said document expression space
correction unit.
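
The claims do not define how the document expression space is corrected. One possible reading, offered purely as an assumption, is that terms prominent in the selected clusters are down-weighted when similarity is computed. The sketch below illustrates that reading only; the names term_weights and weighted_cosine and the damping parameter are hypothetical.

```python
import math


def term_weights(selected_vectors, damping=0.5):
    # Assumed "characteristics amount": the set of terms seen in the selected clusters.
    seen = {term for vec in selected_vectors for term in vec}
    return {term: damping for term in seen}


def weighted_cosine(a, b, weights):
    # Cosine similarity in a space where each term axis is scaled by its weight.
    def w(term):
        return weights.get(term, 1.0)

    dot = sum(w(t) * a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(w(t) * v * v for t, v in a.items()))
    nb = math.sqrt(sum(w(t) * v * v for t, v in b.items()))
    return dot / (na * nb) if na and nb else 0.0
```

With an empty weight table the function reduces to ordinary cosine similarity, so the correction only takes effect for terms drawn from the selected clusters.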



11. The document classification device according to claim 9,
further comprising document characteristic vector memory which
stores document characteristic vectors created by vector
creation unit; and

document expression space correction unit which corrects
the document expression space when determining the degree of
similarity between document characteristic vectors stored in
said document characteristic vector memory, based on a
characteristics amount calculated from clusters selected by
said cluster selection unit;

said classification unit classifies the documents based on
the degree of similarity between document characteristic
vectors created by said vector creation unit, using the document
expression space corrected by said document expression space
correction unit.

12. The document classification device according to claim 8,
further comprising selection information appending unit which
appends selection information showing the fact of selection
when all or part of the documents belonging to a cluster of
documents created by said classification unit have been
selected;

said display unit displays the cluster characteristics,
and also displays the selection information appended by said
selection information appending unit.




13. The document classification device according to claim 8,
wherein said classification category memory stores cluster
characteristics and/or information created by an operator, in
addition to all or part of the documents belonging to a cluster
of documents selected by said selection specification unit, as
constituent elements of classification categories.

14. A document classification device which classifies
document clusters in accordance with contents thereof
comprising:

document input unit which inputs document data groups;

document dividing unit which divides document data into
one or multiple divided document data based on a predetermined
reference;

document-divided document map creation unit which
creates a map showing the correspondence between the document
data and the divided document data;

divided document classification unit which classifies
the divided document data;

divided document classification result creation unit
which creates divided document classification result
information based on a classification result of said divided
document classification unit; and

document classification result creation unit which
creates classification result information of the above document
data using the document-divided document map and the divided
document classification result information.
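
The divide, map, classify, and aggregate flow recited in claim 14 can be sketched informally as follows. The blank-line split used as the "predetermined reference", the keyword rule standing in for the divided document classification unit, and all function names are assumptions made for illustration.

```python
from collections import Counter


def divide(doc_id, text):
    # Document dividing unit: split on blank lines (one possible predetermined reference).
    parts = [p.strip() for p in text.split("\n\n") if p.strip()]
    return {f"{doc_id}#{i}": part for i, part in enumerate(parts)}


def classify_part(text):
    # Placeholder classifier (hypothetical keyword rule).
    return "hardware" if "printer" in text.lower() else "other"


def classify_documents(documents):
    doc_map = {}      # document-divided document map: part id -> document id
    part_labels = {}  # divided document classification result information
    for doc_id, text in documents.items():
        for part_id, part in divide(doc_id, text).items():
            doc_map[part_id] = doc_id
            part_labels[part_id] = classify_part(part)
    result = {}       # document classification result: document id -> label counts
    for part_id, label in part_labels.items():
        result.setdefault(doc_map[part_id], Counter())[label] += 1
    return doc_map, part_labels, result
```

Because the map is kept separately from the labels, a single document can contribute to several classification categories through its divided parts.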

15. The document classification device according to claim 14,
further comprising document save unit which saves the document
data;

divided document save unit which saves the divided
document data; and

document-divided document map save unit which saves a
document-divided document map created by said document-divided
document map creation unit.



16. The document classification device according to claim 15,
further comprising divided document classification result save
unit which saves the divided document classification result
information created by said divided document classification
result creation unit.



17. The document classification device according to claim 14,
wherein a plurality of divided document data created by said
document dividing unit comprises the document data in its state
prior to being divided.



18. The document classification device according to claim 14,
wherein said document dividing unit divides document data based
on information relating to the structure of the document data.



19. The document classification device according to claim 14,
further comprising document element extraction unit which
extracts elements in the document data;

element-accompanying information extraction unit which
extracts element-accompanying information accompanying the
elements extracted by said document element extraction unit;

said document dividing unit divides the document data
using elements extracted by said document element extraction
unit, or the elements and element-accompanying information
extracted by said element-accompanying information extraction
unit.

20. The document classification device according to claim 14,
wherein said document dividing unit divides the document data
in compliance with a specified specification range.

21. The document classification device according to claim 14,
wherein said document dividing unit divides the document data
based on the number of letters, the number of sentences, or both
the number of letters and the number of sentences.
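
Division by number of letters, number of sentences, or both, as recited in claim 21, can be sketched informally as below. The naive sentence splitter, the choice to count every character (including spaces) as a letter, and the names split_sentences and divide_by_limits are assumptions for illustration.

```python
import re


def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace (an assumption).
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def divide_by_limits(text, max_sentences=None, max_letters=None):
    """Group sentences into chunks that respect either or both limits."""
    chunks, current = [], []
    for sentence in split_sentences(text):
        candidate = current + [sentence]
        too_many = max_sentences is not None and len(candidate) > max_sentences
        # "Letters" here counts all characters of the joined chunk, spaces included.
        too_long = max_letters is not None and len(" ".join(candidate)) > max_letters
        if current and (too_many or too_long):
            chunks.append(" ".join(current))
            current = [sentence]
        else:
            current = candidate
    if current:
        chunks.append(" ".join(current))
    return chunks
```

A single sentence longer than max_letters is kept whole as its own chunk rather than being cut mid-sentence.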



22. The document classification device according to claim 14,
wherein said document classification result creation unit
extracts and presents information showing document data, and
representative information accompanying the document data, as
classification result information.



23. The document classification device according to claim 22,
wherein said document classification result creation unit
extracts and presents information showing divided document data,
and representative information accompanying the divided
document data, as classification result information.



24. A document processing method which outputs a plurality
of input document data in order to display or print the document
data in a predetermined format, comprising the steps of:

storing input document data;

selecting all or part of the document data stored in the
storing step;

extracting data relating to characteristics of letter
rows from all or part of the document data selected in the
selection step;

work-processing all or part of the document data based
on the data relating to characteristics of letter rows extracted
in the characteristics extraction step; and

outputting all or part of the document data
work-processed in the work-processing step.



25. The document processing method according to claim 24,
wherein the step of outputting comprises the steps of setting
a plurality of item values based on the contents of all or part
of the document data work-processed in the work-processing
step; and totalizing all or part of the document data for each
item value set in the item value setting step; and

outputting all or part of the document data in the format
of a table having an item value as at least one axis.



26. The document processing method according to claim 24,
wherein the step of outputting comprises outputting all or part
of the document data work-processed in the work-processing step
together with all or part of the document data in its state prior
to work-processing in the work-processing step.



27. The document processing method according to claim 24,
wherein the step of storing further comprises storing all or
part of the document data work-processed in the work-processing
step.



28. The document processing method according to claim 24,
wherein the step of selecting further comprises selecting all
or part of the document data output in the output step.



29. The document processing method according to claim 24,
wherein the step of storing a document further comprises storing
data relating to contents of the work processing.

30. A document classification method which classifies
documents based on contents thereof comprising the steps of:

inputting document data;

language-analyzing document data input in the step of
inputting and obtaining language analysis information;

creating document characteristic vectors for the
document data based on the language analysis information
obtained in the step of language-analyzing;

classifying documents based on the degree of similarity
between document characteristic vectors created in the step of
creating vectors, and creating clusters of documents;

calculating cluster characteristics, being
characteristics of clusters of documents created in the step
of classifying; and

storing cluster characteristics, calculated in the step
of calculating cluster characteristics, as constituent
elements of classification categories.



31. A document classification method of classifying
documents based on contents thereof, comprising the steps of:

inputting document data;

language-analyzing document data input in the step of
inputting and obtaining language analysis information;

creating document characteristic vectors for the
document data based on the language analysis information
obtained in the step of language-analyzing;

classifying documents based on the degree of similarity
between document characteristic vectors created in the step of
creating vectors, and creating clusters of documents;

calculating cluster characteristics, which are
characteristics of clusters of documents created in the step
of classifying;

displaying the cluster characteristics calculated in the
step of calculating cluster characteristics;

selecting predetermined clusters from clusters of
documents created in the step of classifying; and

storing cluster characteristics, calculated in the step
of calculating cluster characteristics, as constituent
elements of classification categories.



32. The document classification method according to claim 31,
further comprising the step of correcting document
characteristic vectors stored in the step of storing document
characteristic vectors, so that document characteristic
vectors of documents belonging to clusters selected by the step
of selecting clusters are deleted;

the step of classifying comprising classifying documents
based on the document characteristic vectors corrected by the
step of correcting vectors.

33. The document classification method according to claim 31,
further comprising a step of correcting document expression
space when determining the degree of similarity between
document characteristic vectors stored in the step of storing
document characteristic vectors, based on a characteristics
amount calculated from clusters selected in the step of
selecting clusters;

the step of classifying comprising classifying documents
based on the degree of similarity between document
characteristic vectors created in the step of creating vectors,
using the document expression space corrected in the step of
correcting the document expression space.



34. The document classification method according to claim 32,
further comprising a step of correcting document expression
space when determining the degree of similarity between
document characteristic vectors stored in the step of storing
document characteristic vectors, based on a characteristics
amount calculated from clusters selected in the step of
selecting clusters;

the step of classifying comprising classifying documents
based on the degree of similarity between document
characteristic vectors created in the step of creating vectors,
using the document expression space corrected in the step of
correcting the document expression space.



35. The document classification method according to claim 31,
further comprising the step of appending selection information
showing the fact of selection when all or part of the documents
belonging to a cluster of documents created in the step of
classifying have been selected;

the step of displaying comprising displaying the cluster
characteristics, and also displaying the selection information
appended in the step of appending selection information.



36. The document classification device according to claim 31,
wherein the step of creating classification categories
comprises creating cluster characteristics and/or information
created by an operator, in addition to all or part of the
documents belonging to a cluster of documents selected in the
step of specifying selection, as constituent elements of
classification categories.



37. A document classification method which classifies
document clusters in accordance with contents thereof
comprising the steps of:

inputting document data groups; dividing document data
into one or multiple divided document data based on a
predetermined reference; creating a map showing the
correspondence between the document data and the divided
document data; classifying the divided document data; creating
divided document classification result information based on the
classification result of classifying the divided documents; and
creating classification result information of the document data
using the document-divided document map and the divided
document classification result information.

38. A computer-readable recording medium in which is stored
programs for executing a document classification method, which
document classification method comprising the steps of:

storing input document data;

selecting all or part of the document data stored in the
storing step;

extracting data relating to characteristics of letter
rows from all or part of the document data selected in the
selection step;

work-processing all or part of the document data based
on the data relating to characteristics of letter rows extracted
in the characteristics extraction step; and

outputting all or part of the document data
work-processed in the work-processing step.



39. A computer-readable recording medium in which is stored
programs for executing a document classification method, which
document classification method comprising the steps of:

inputting document data;

language-analyzing document data input in the step of
inputting and obtaining language analysis information;

creating document characteristic vectors for the
document data based on the language analysis information
obtained in the step of language-analyzing;

classifying documents based on the degree of similarity
between document characteristic vectors created in the step of
creating vectors, and creating clusters of documents;

calculating cluster characteristics, being
characteristics of clusters of documents created in the step
of classifying; and

storing cluster characteristics, calculated in the step
of calculating cluster characteristics, as constituent
elements of classification categories.

40. A computer-readable recording medium in which is stored
programs for executing a document classification method, which
document classification method comprising the steps of:

inputting document data;

language-analyzing document data input in the step of
inputting and obtaining language analysis information;

creating document characteristic vectors for the
document data based on the language analysis information
obtained in the step of language-analyzing;

classifying documents based on the degree of similarity
between document characteristic vectors created in the step of
creating vectors, and creating clusters of documents;

calculating cluster characteristics, which are
characteristics of clusters of documents created in the step
of classifying;

displaying the cluster characteristics calculated in the
step of calculating cluster characteristics;

selecting predetermined clusters from clusters of
documents created in the step of classifying; and

storing cluster characteristics, calculated in the step
of calculating cluster characteristics, as constituent
elements of classification categories.



41. A computer-readable recording medium in which is stored
programs for executing a document classification method, which
document classification method comprising the steps of:

inputting document data groups; dividing document data
into one or multiple divided document data based on a
predetermined reference; creating a map showing the
correspondence between the document data and the divided
document data; classifying the divided document data; creating
divided document classification result information based on the
classification result of classifying the divided documents; and
creating classification result information of the document data
using the document-divided document map and the divided
document classification result information.